Supervised Word Sense Disambiguation with Support Vector Machines and multiple knowledge sources

نویسندگان

  • Yoong Keok Lee
  • Hwee Tou Ng
  • Tee Kiah Chia
چکیده

We participated in the SENSEVAL-3 English lexical sample task and multilingual lexical sample task. We adopted a supervised learning approach with Support Vector Machines, using only the official training data provided. No other external resources were used. The knowledge sources used were partof-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. For the translation and sense subtask of the multilingual lexical sample task, the English sense given for the target word was also used as an additional knowledge source. For the English lexical sample task, we obtained fine-grained and coarse-grained score (for both recall and precision) of 0.724 and 0.788 respectively. For the multilingual lexical sample task, we obtained recall (and precision) of 0.634 for the translation subtask, and 0.673 for the translation and sense subtask.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation

In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include Support Vector Machines (SVM), Naive Bay...

متن کامل

Kernel Methods for Word Sense Disambiguation and Acronym Expansion

The scarcity of manually labeled data for supervised machine learning methods presents a significant limitation on their ability to acquire knowledge. The use of kernels in Support Vector Machines (SVMs) provides an excellent mechanism to introduce prior knowledge into the SVM learners, such as by using unlabeled text or existing ontologies as additional knowledge sources. Our aim is to develop...

متن کامل

USP-IBM-1 and USP-IBM-2: The ILP-based Systems for Lexical Sample WSD in SemEval-2007

We describe two systems participating of the English Lexical Sample task in SemEval2007. The systems make use of Inductive Logic Programming for supervised learning in two different ways: (a) to build Word Sense Disambiguation (WSD) models from a rich set of background knowledge sources; and (b) to build interesting features from the same knowledge sources, which are then used by a standard mod...

متن کامل

It Makes Sense: A Wide-Coverage Word Sense Disambiguation System for Free Text

In this paper, we present IMS, a supervised English all-words WSD system. The flexible framework of IMS allows users to integrate different preprocessing tools, additional features, and different classifiers. By default, we use linear support vector machines as the classifier with multiple knowledge-based features. In our implementation, IMS achieves state-of-the-art results on several SensEval...

متن کامل

The WSD Development Environment

In this paper we present the Word Sense Disambiguation Development Environment (WSDDE), a platform for testing various Word Sense Disambiguation (WSD) technologies, as well as the results of first experiments in applying the platform to WSD in Polish. The current development version of the environment facilitates the construction and evaluation of WSD methods in the supervised Machine Learning ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004